Nowadays, the amount of data created by the government and public sector 
organizations is growing at an exponential rate. Data sharing and the 
interoperability of e-government systems pose technological challenges. The 
lack of technical interoperability prevents the successful exchange and 
sharing of information among public organizations. To meet this challenge, 
enhancing interconnection and communication between different public 
infrastructures is an essential condition. To optimize the provisioning of 
storage resources, software defined storage (SDS) solutions add flexibility 
and adaptability to the storage process by isolating the hardware from the 
software. Hyper-converged infrastructure (HCI) is an emerging set of SDS 
solutions that provide compute, network and storage in a single platform. 
This paper presents an HCI-based storage architecture to store public data 
from different public entities, enhance collaboration and improve technical 
interoperability. The relevance of this approach to e-government 
interoperability is that it allows public organizations to store their data in an 
efficient and flexible manner on the one hand, and to participate in Morocco’s 
e-government project on the other. 
1. INTRODUCTION 

Government digital transformation involves exploiting information and communication technologies (ICT), 
such as operating systems, storage resources and software, to improve an organization’s performance [1]. In 
parallel, e-government projects aim to improve day-to-day public administration in order to deliver efficient 
public services. However, providing efficient public e-services remains difficult because transactions are 
repeated across separate government systems [2]. Optimizing the delivery of public services therefore requires 
a high level of coordination and integration between different e-government systems. To achieve a successful 
integration, introducing interoperability at all levels between public administrations is a fundamental step [3]. 

E-government interoperability is the ability of national organizations to work together 
and exchange data in an integrated way [4]. It spans several layers that make it possible to manage and share 
information between different governmental entities efficiently. Governments have addressed 
interoperability in several ways: policies, standards, laws and technical infrastructures [5]. Generally, the 
literature presents two main layers of interoperability, the technical and the non-technical one [6]. The first is 
about the hardware, software, platforms and infrastructures that enable communication among systems [7]. 
Technical interoperability deals with the technical aspects of connecting information systems; it includes 
interconnection services, data integration, data exchange, data storage, middleware, and data 
warehouses. It is considered the “starting point” for reaching e-government interoperability [8]. The second 
layer covers all other levels: organizational, semantic, and syntactic. To achieve a high level of interoperability, 
technical requirements must be met, notably data presentation, data type and metadata, communication 
protocol, application, and network infrastructure [2]. Data centers, as the backbone of IT 
infrastructure, connect data and applications through a network of compute and storage resources. 
Managing resources in homogeneous data centers at an efficient cost is a major challenge for public 
administrations [9]. In this respect, software-defined storage (SDS) is a modern generation of storage system that 
offers more economical storage than traditional systems [10]. 

In Morocco, as in many countries, the government is making considerable efforts to implement 
digital technologies in the public sector. Digital transformation is a key initiative for Morocco, and a 
strategic approach to it is emerging progressively. Under the national plan ‘Digital Morocco 2020’, the 
government has invested in installing e-government platforms to enhance digital interaction with both 
citizens and businesses. Recognizing the benefits of the digital transformation of government services, the 
Moroccan government created the Agency of Digital Development (ADD) in 2017. The main goal of the ADD is 
to reinforce the digital field in the country and support the growth of digital infrastructure. The agency is 
responsible for creating the digital strategy and implementing general orientations to support digital 
development in Morocco by 2025 [11]. At the same time, the COVID-19 crisis had a major impact on the 
digital transformation of Moroccan public services. It was an occasion for the Moroccan government to 
invest in digitalization and adopt digital-driven initiatives in public service delivery [12]. 

Despite these digital initiatives, the country is facing challenges that compromise progress in the 
e-government domain. According to the UN e-government survey 2020, Morocco did not register a high 
e-government development index (EGDI) ranking: it ranked 7th in Africa, with a middle online service index (OSI) 
value of 0.5235 [5]. Moreover, Morocco has not successfully reaped the benefits of its digital transformation 
reform efforts, and the mission of the Agency of Digital Development (ADD) is limited by the failure of a 
national digital strategy [13]. Morocco is also at an early stage of developing data governance 
compared to countries of the Gulf Cooperation Council, especially the United Arab Emirates [14]. 

In addition, deficient coordination and interoperability between public organizations 
hinder the development of Moroccan public e-services. According to the national study conducted to 
assess the e-readiness index of Moroccan public e-services, only 23% of national public 
e-services are fully digitized (the high level of maturity) and 60% of e-services are still informational (the low 
level of maturity). Furthermore, the granular assessment indicates that dependent e-services, whose data and 
interactions depend on other administrations (external or internal), have the lowest rank [15]. 

Relevant initiatives, frameworks and standards to develop Moroccan e-government interoperability 
have been defined. Moroccan research has mostly focused on semantic interoperability [16]-[19]. Another interesting 
work treats three important aspects of systems-of-systems interoperability: barriers, scopes and 
levels [20]. Another approach, which considers the security level of technical interoperability between federated 
systems using an identity management system, was presented recently in [21]. Still, little research addresses the 
challenges of storing government data and obtaining a cost-effective storage solution. Public data storage 
management has not attracted much research attention thus far, and no standard architecture has been presented for 
optimizing storage in public organizations. 

In this context, we propose an efficient architecture to address the issue of public data storage and 
introduce a new framework based on the hyper-converged infrastructure (HCI) concept to support technical 
interoperability. This modern infrastructure aims to manage the storage of data from different ministries and 
public agencies, achieve better collaboration and improve the technical interoperability of information systems 
in the public sector. Building on the HCI concept ensures the efficiency and flexibility of the 
storage systems. This paper is structured as follows: Section 2 presents an overview of government 
interoperability and describes how data centers and IT infrastructures have evolved. 
Section 3 details the method of dimensioning the storage in the central data center. Section 4 describes 
the conceptual architecture of our approach. Finally, Section 5 presents the conclusion and future work. 


2. BACKGROUND REVIEW AND RELATED WORK 
2.1. E-government interoperability 

Previous studies have presented e-government interoperability as the ability of government systems 
to communicate and share integrated information by using common standards [22]. In the first version of the 
European interoperability framework (EIF), three levels of interoperability were identified. The first is the 
organizational level, which addresses the processes and structures through which organizations interact. The 
second is the technical level, which addresses the issues of linking systems and infrastructures to exchange data. 
The last is the semantic level, which addresses the issues of interpreting data by different 
organizations [23]. In the latest version of the EIF, published in 2017, a legal layer was added to the existing 
three levels of interoperability. The Commission highlights the need to set up specific information systems 
and architectures that ensure the continuity of public data [24]. 

The definition of technical interoperability has broadened over the years. This category of 
interoperability does not rely solely on infrastructures and communication protocols to connect systems 
[25], [26]; it also includes strategies for storing data in shared central repositories [27]. To promote and 
support interoperability in public administration, serious planning is needed to ensure the 
compatibility and storage interoperability of the platforms of different public institutions. Public 
organizations around the world, as major creators of data, need support in understanding how to harness 
technologies to cope with the increasing amount of data [28]. Predictions state that the global 
datasphere will reach 175 ZB in 2025 [29]. The exponential growth of public data thus presents an 
opportunity to make effective use of data storage technologies [30]. Meanwhile, it 
should be noted that the budget of public administrations is limited, and it is vital to find solutions that 
optimize state spending on ICT [31]. 


2.2. Modern data centers: software defined storage concept (SDS) 

It is vital for public administrations to have a general approach to their data storage, whether it is installed 
in the cloud or on their own infrastructures. Given the critical nature of the data maintained by 
government administrations, keeping data on their own servers (on-premises solutions) is recommended to 
maintain a level of control that the cloud usually cannot offer [32]. The need for flexible and more manageable 
storage has contributed to the development of storage models. The software-defined storage (SDS) concept 
changes the data center in practical ways: SDS abstracts the physical storage from the controlling software, 
unifies different storage solutions and puts them in a central virtual store. This approach has attracted both the 
academic and commercial communities as a way to overcome the deficiencies of traditional storage infrastructures [33]. 
Unlike traditional data storage solutions (SAN and NAS), SDS does not depend on the capacity 
of the hardware. It is one of the promising approaches in the computer networking field, enabling storage 
resources to respond to progressive changes in application requirements. Furthermore, SDS is an 
adequate technique to enhance efficiency, flexibility and cost-effectiveness [34]. By unifying data and 
interfaces across different sources and presenting them in a single view, SDS solutions provide new 
approaches for managing and controlling data storage. 


2.3. Modern infrastructure: hyperconverged infrastructures (HCI) 

The term hyperconvergence combines “hyper”, which refers to the hypervisor, and “convergence”. Hyperconverged 
infrastructure (HCI) is a set of SDS solutions that provide network, compute and storage in a single 
consolidated appliance [35]. It combines compute resources, memory and storage in a single 
platform. In an HCI, all components are virtualized, which offers companies a flexible and resilient 
environment [36], [37]. Despite the growing adoption of SDS in enterprise storage systems, 
academic research on hyperconvergence remains limited [38]. HCI solutions based on the x86 
architecture help enterprises reduce expenses while improving storage [39]. 

Data center architectures have moved from traditional infrastructure, through converged infrastructure, to 
hyper-converged infrastructure: 

- The traditional (three-tier) infrastructure model relies on three separate units: compute, storage, and 
network components. Each component is configured and managed separately. Moreover, support and 
warranty are managed individually. This requires different IT staff with expertise in all data center 
fields. 

- The converged infrastructure develops the traditional model by combining compute and storage into a 
single physical appliance that can be managed centrally. Even if components are still physically 
separate, the global management is optimized. 

- The hyper-converged infrastructure is the successor of converged infrastructure; it brings together servers, 
storage, networking and management in a single infrastructure with intelligent software. This creates a 
simple and efficient data center environment, optimizes overall data center performance and 
eliminates the need for different IT administrators. In addition, HCI solutions include a software 
layer that makes it possible to deploy, manage and administer hardware resources from a single interface. 

As shown in Figure 1, converged and hyper-converged infrastructures have extended the capabilities of 
data centers. Due to the complexity of data storage, HCI architectures are attracting academic researchers seeking to 
improve storage efficiency and reach high performance [41]. The major benefit is the integration of all functions in 
software. This new approach simplifies data management, and it is now seen as the future of data 
centers [42], [43]. The architecture we propose essentially completes the work of [44] by providing an 
efficient storage architecture for the central data warehouse presented as a solution for e-government 
interoperability. 

Figure 1. Comparing non-converged, converged and hyper-converged infrastructure [40] 


3. MATERIALS AND METHODS 
3.1. VMware vSAN HCI solution 

There are several HCI solutions on the IT market, and VMware is one of the leaders in their 
development. According to an International Data Corporation (IDC) analysis, VMware, with its 
vSAN-powered HCI, was one of the top three companies leading the HCI market in the third 
quarter of 2020 [45]. As seen in Table 1, IDC reports that systems running VMware HCI accounted for 
40.2% of the market, an increase of 1.6 percentage points over the third quarter of 2019. 


Table 1. Top three companies by market share (source: IDC tracker, December 15, 2020) 

Company          3Q20 Market Share   3Q19 Market Share 
VMware           40.2%               38.6% 
Nutanix          25.1%               27.1% 
Cisco            5.9%                5.4% 
Rest of Market   28.7%               28.9% 
Total            100%                100% 


VMware highlights three reasons to explain this remarkable growth: deployment flexibility, support 
for cloud-native applications and a path to the hybrid cloud [46]. For these reasons, we propose an architecture based 
on VMware vSAN systems to store public data. First, we discuss the general requirements and 
configuration of the VMware vSAN solution; second, we present our architecture to store public data and 
ensure technical interoperability between different public information systems. 


3.2. Dimensioning vSAN components 

To achieve an optimal level of performance, planning the capabilities of the hosts and the storage 
configuration is strongly recommended before deploying a vSAN solution in the vSphere environment. The 
environment must meet all hardware, software, cluster and network requirements. Therefore, based on 
the VMware guide for vSAN planning, design and deployment [47], we dimension the key 
elements of a vSAN architecture, namely the vSAN cluster, the vSAN storage components, the vSAN hosts, and 
the vSAN network. At the same time, the proposed HCI platform must support replication, compression and 
deduplication, and ensure high availability. 


3.3. Sizing virtual SAN cluster 

A vSAN cluster must have at least three server nodes; each node includes its internal storage drives 
(SSD, SAS, or SATA), which are used to create the disk groups that make up the vSAN datastore. The vSAN cluster 
uses VMware features such as vMotion to ensure high availability (HA) for the virtual machines and to avoid 
downtime during maintenance operations. Additionally, to protect the site against failures (host failure, disk failure, 
and rack failure), vSAN features such as failures to tolerate (FTT) and fault domains (FD) need to be activated. 
The number N of hosts needed for the cluster is calculated as (1). 

$N = 2 \times \mathrm{FTT} + 1$ (1) 

For example, if FTT=1, then N=3 and three hosts are required; if FTT=2, then N=5 and five hosts are required. 
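
As a minimal sketch of equation (1), the host count can be computed directly from the FTT value. The function name `hosts_required` is our own illustrative choice and is not part of any VMware tool.

```python
def hosts_required(ftt: int) -> int:
    """Number of hosts needed for a given failures-to-tolerate value, per equation (1)."""
    if ftt < 0:
        raise ValueError("FTT must be a non-negative integer")
    return 2 * ftt + 1


# Print the host count for a few FTT values.
for ftt in (1, 2, 3):
    print(f"FTT={ftt}: {hosts_required(ftt)} hosts required")
```

The three-node minimum stated above corresponds to FTT=1 in this formula.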


3.4. Sizing virtual SAN datastore 

When sizing the vSAN datastore capacity ($C_{DA}$), we must consider the following major 
components: the expected virtual machine VMDK capacity ($C_{vm}$), the FTT method and the virtual machine storage 
policy (VMSP). The expected virtual machine (VM) consumption $C_{exp}$ is calculated by multiplying the number 
of VMs by the storage capacity of each VM, as in (2). 

$C_{exp} = N \times C_{vm}$ (2) 

where $C_{exp}$ is the expected overall capacity, N is the number of VMs in the cluster, and $C_{vm}$ is the storage capacity per VM. 
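
Equation (2) can be illustrated with a short sketch. The figures used here (20 VMs of 2 TB each) are hypothetical and serve only to show the arithmetic; the function name is our own.

```python
def expected_capacity(num_vms: int, capacity_per_vm_tb: float) -> float:
    """Expected overall capacity: number of VMs times capacity per VM, per equation (2)."""
    if num_vms < 0 or capacity_per_vm_tb < 0:
        raise ValueError("inputs must be non-negative")
    return num_vms * capacity_per_vm_tb


# Hypothetical example: 20 VMs, each provisioned with 2 TB.
print(expected_capacity(20, 2.0), "TB")
```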


3.5. Failures to tolerate (FTT) 

This parameter defines how many host, hard drive or other device failures a virtual machine can survive. 
As discussed earlier, it is related to the number of hosts in the vSAN cluster. An FTT value 
of 1 means that a VM hosted in a vSAN datastore can tolerate a single failure without 
impact on data integrity or availability. 


3.6. The virtual machine storage policy (VMSP) 

VMware vSAN allows assigning each virtual machine a storage policy that defines how the 
virtual machine disk (VMDK) is protected inside the vSAN cluster. This parameter can be either RAID-1 
(mirroring) or RAID-5/6 (erasure coding). 

- If the VMSP is set to RAID-1 mirroring, the actual VMDK and its copy are both stored in the vSAN 
datastore. In this case, the real needed capacity is calculated as (3). 

$C_{real} = 2 \times C_{exp}$ (3) 

where $C_{real}$ is the actual overall capacity. 

- If the VMSP is set to RAID-5/6, about 33% of extra space is added to store the VMDK and its parity. 
In this case, the real needed capacity is calculated as (4). 

$C_{real} = 1.33 \times C_{exp}$ (4) 
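
The two policy cases of equations (3) and (4) can be sketched as a simple lookup. The dictionary keys and the function name are our own labels, chosen for illustration; the multipliers are those stated above.

```python
# Capacity multipliers per storage policy, taken from equations (3) and (4).
POLICY_FACTOR = {
    "RAID-1": 2.0,     # mirroring: the VMDK plus one full copy
    "RAID-5/6": 1.33,  # erasure coding: about 33% extra space for parity
}


def real_capacity(c_exp_tb: float, policy: str) -> float:
    """Real needed capacity for an expected capacity under the given storage policy."""
    if policy not in POLICY_FACTOR:
        raise ValueError(f"unknown storage policy: {policy}")
    return POLICY_FACTOR[policy] * c_exp_tb


# Hypothetical expected capacity of 40 TB under each policy.
for policy in POLICY_FACTOR:
    print(policy, round(real_capacity(40.0, policy), 2), "TB")
```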


3.7. Datastore capacity ($C_{DA}$) 

First, we calculate the initial vSAN datastore capacity ($C_{DA0}$) that the vSAN cluster will use to host 
virtual machine disks (VMDKs) as (5). 

$C_{DA0} = C_{real} \times (\mathrm{FTT} + 1)$ (5) 

Second, we reserve the vSAN capacity overhead $C_{O}$, which represents about 30% of the initial 
datastore capacity $C_{DA0}$, for vSAN maintenance operations, as recommended by the VMware guide. 

$C_{O} = 0.3 \times C_{DA0}$ (6) 

The real vSAN datastore capacity $C_{DA}$ is calculated as (7). 

$C_{DA} = C_{DA0} + C_{O}$ (7) 
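
The whole sizing chain, from equation (2) to equation (7), can be sketched in a few lines. This is only an illustration of the formulas as stated in this section: the function and key names are our own, and the input figures are hypothetical.

```python
# Multipliers from equations (3) and (4), and the overhead ratio from equation (6).
POLICY_FACTOR = {"RAID-1": 2.0, "RAID-5/6": 1.33}
OVERHEAD_RATIO = 0.3


def datastore_capacity(num_vms: int, capacity_per_vm_tb: float, ftt: int, policy: str) -> dict:
    """Apply equations (2) to (7) in order and return every intermediate value in TB."""
    c_exp = num_vms * capacity_per_vm_tb   # (2) expected capacity
    c_real = POLICY_FACTOR[policy] * c_exp  # (3) or (4) real needed capacity
    c_da0 = c_real * (ftt + 1)              # (5) initial datastore capacity
    c_o = OVERHEAD_RATIO * c_da0            # (6) maintenance overhead
    c_da = c_da0 + c_o                      # (7) real datastore capacity
    return {"c_exp": c_exp, "c_real": c_real, "c_da0": c_da0, "c_o": c_o, "c_da": c_da}


# Hypothetical scenario: 20 VMs of 2 TB each, FTT=1, RAID-1 mirroring.
for name, value in datastore_capacity(20, 2.0, 1, "RAID-1").items():
    print(f"{name}: {value:.1f} TB")
```

With these hypothetical inputs, the chain gives 40 TB expected, 80 TB after mirroring, 160 TB of initial datastore capacity, 48 TB of overhead and 208 TB in total.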
Figure 2 shows the flowchart for calculating the vSAN datastore capacity. 


3.8. Planning vSAN host 

Planning the configuration of the hosts in the vSAN cluster includes sizing memory and CPU. 
We have to consider carefully the memory per virtual machine and per host, and the number of disk groups 
per host. Equally, we have to calculate the number of vCPUs based on the expected number of VMs. 

Figure 2. Flowchart of planning vSAN datastore capacity 
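
The host planning step described above can be sketched as follows. The paper does not give formulas for CPU and memory sizing, so everything in this block is an assumption made for illustration: the vCPU-to-core consolidation ratio, the per-host memory overhead, the even spread of VMs across hosts, and all function and parameter names.

```python
import math


def cores_per_host(num_vms: int, vcpus_per_vm: int, num_hosts: int,
                   vcpus_per_core: float = 4.0) -> int:
    """Physical cores needed per host, assuming VMs are spread evenly.

    vcpus_per_core is an assumed consolidation ratio, not a VMware-prescribed value.
    """
    total_vcpus = num_vms * vcpus_per_vm
    total_cores = total_vcpus / vcpus_per_core
    return math.ceil(total_cores / num_hosts)


def memory_per_host_gb(num_vms: int, mem_per_vm_gb: float, num_hosts: int,
                       host_overhead_gb: float = 32.0) -> int:
    """Memory needed per host: an even share of VM memory plus an assumed fixed overhead."""
    return math.ceil(num_vms * mem_per_vm_gb / num_hosts + host_overhead_gb)


# Hypothetical scenario: 20 VMs with 4 vCPUs and 16 GB each, on a 3-host cluster.
print(cores_per_host(20, 4, 3), "cores per host")
print(memory_per_host_gb(20, 16.0, 3), "GB per host")
```

This sketch does not reserve capacity for a failed host; a real design would also size the remaining hosts to carry the full load when FTT failures occur.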


3.9. Planning vSAN network 

Planning the networking features provides availability, security, and guaranteed bandwidth in a vSAN 
cluster. For network load balancing, vSAN uses a teaming and failover policy to ensure network 
redundancy. This policy includes a teaming algorithm, which defines how the switches distribute traffic between the 
network adapters in a team, and a failover configuration, which defines how network traffic is rerouted in case of 
failure. 


3.10. Planning vSAN disk group 

Each server of the vSAN cluster must contain at least one disk group. A disk group combines the physical 
storage capacity on a host into a group of physical devices that provide the required performance to the vSAN 
cluster. Each disk group must include one flash device used for caching and between 1 and 7 devices used for 
capacity. Likewise, to achieve high read and write throughput and low latency, especially when hosting 
databases, using smaller-capacity flash devices such as SSD or NVMe is recommended. 
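
The disk group rule stated above (one cache device and one to seven capacity devices) can be expressed as a small validation sketch. The `DiskGroup` class and the function name are hypothetical helpers for illustration, not part of any VMware API.

```python
from dataclasses import dataclass


@dataclass
class DiskGroup:
    """Illustrative description of a disk group by its device counts."""
    cache_devices: int
    capacity_devices: int


def is_valid_disk_group(group: DiskGroup) -> bool:
    """Check the rule: exactly one cache device and between 1 and 7 capacity devices."""
    return group.cache_devices == 1 and 1 <= group.capacity_devices <= 7


# A few candidate layouts checked against the rule.
for group in (DiskGroup(1, 4), DiskGroup(1, 8), DiskGroup(0, 3)):
    print(group, "valid" if is_valid_disk_group(group) else "invalid")
```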


4. RESULTS OF DIMENSIONING vSAN HCI ARCHITECTURE 

In order to build an interoperable solution for data sharing, organizations must store all 
their data in a central data warehouse. A data warehouse is a system designed to store data from single or 
multiple sources [48]. For this purpose, we propose to host each public service’s local data warehouse in 
a dedicated virtual machine (VM). Each VM is hosted on the vSAN datastore, and every public 
administration can transfer a copy of its local data warehouse to its corresponding VM. This allows 
data to be exchanged in an interoperable manner without transformations. 

To ensure high availability, we create two sites: the principal site, the “Production Site”, and the 
secondary one, the “Disaster Recovery Site”, as depicted in Figure 3. With vCenter HA, we configure high 
availability between the two sites. The vSphere HA cluster automatically elects a single host as the primary host. 
The primary host communicates with vCenter Server and monitors the state of the virtual machines. 
The two sites are interconnected by a leased line (LL), a reliable link that 
ensures continuous data flow. Additionally, public administrations can use an ordinary broadband connection and 
a secure mechanism for the transfer of data, such as the SSH file transfer protocol (SFTP), a virtual private 
network (VPN), FTP over transport layer security (TLS), or FTP over secure shell (SSH). 

VMware Site Recovery Manager (SRM) is proposed for disaster recovery management and 
automation. This extension coordinates with the VMware vSphere Replication solution to automate the recovery 
process. Using the data replicated from the “Production Site”, virtual machines can safely resume the 
provision of services. This VMware vSAN architecture thus provides a rapid business continuity 
solution. 

Figure 3. The proposed vSAN architecture 


5. CONCLUSION 

In this paper, we developed an architecture based on hyperconverged infrastructure 
for e-government interoperability that permits storing public data from different public data warehouses. We 
presented a global architecture for technical interoperability that ensures high availability, fault tolerance and 
performance in a vSAN environment. In future work, we plan to apply our architecture to a real case study at the 
Agency of Digital Development (ADD), targeting practical features such as managing the storage of data from 
different public agencies and guaranteeing technical interoperability for public administration data. 